Least-squares temporal difference learning based on extreme learning machine
Abstract
This paper proposes a least-squares temporal difference (LSTD) algorithm based on the extreme learning machine (ELM), which uses a single-hidden-layer feedforward network to approximate the value function. While LSTD is typically combined with local function approximators, the proposed approach uses a global approximator, which offers better scalability. Experiments on four Markov decision processes show the usefulness of the proposed approach.
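The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the tanh activation, the Gaussian random weights, the small ridge term `reg`, and the names `make_elm_features` and `lstd` are all assumptions made for illustration. The hidden layer is drawn at random and left fixed, as in ELM, and LSTD then solves a linear system for the output weights.

```python
import numpy as np


def make_elm_features(input_dim, n_hidden, seed=0):
    """Return a feature map given by a fixed random hidden layer (ELM-style).

    Assumption: Gaussian weights and tanh units; the source does not specify these.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(input_dim, n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)               # random hidden biases, never trained

    def features(states):
        # states must have shape (n_samples, input_dim)
        return np.tanh(np.asarray(states, dtype=float) @ W + b)

    return features


def lstd(features, states, rewards, next_states, gamma, reg=1e-6):
    """Solve the LSTD system A theta = b for the output weights theta.

    A = sum_t phi(s_t) (phi(s_t) - gamma * phi(s_{t+1}))^T
    b = sum_t phi(s_t) r_t
    The ridge term `reg` is an assumption added so that A is invertible.
    """
    phi = features(states)
    phi_next = features(next_states)
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ np.asarray(rewards, dtype=float)
    return np.linalg.solve(A + reg * np.eye(A.shape[0]), b)


# Toy two-state cycle (hypothetical): 0 -> 1 with reward 0, 1 -> 0 with reward 1.
states = np.array([[0.0], [1.0]])
next_states = np.array([[1.0], [0.0]])
rewards = np.array([0.0, 1.0])

features = make_elm_features(input_dim=1, n_hidden=10, seed=0)
theta = lstd(features, states, rewards, next_states, gamma=0.5)
values = features(states) @ theta  # estimated V(0) and V(1)
```

For this toy problem the Bellman equations give V(0) = 2/3 and V(1) = 4/3 exactly, so `values` should be close to those numbers when `reg` is small. Only the output weights `theta` are fitted; this is what makes the approximator global (every hidden unit responds to every state) while keeping the fit a single linear solve.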
Similar resources
Ensembles of extreme learning machine networks for value prediction
Value prediction is an important subproblem of several reinforcement learning (RL) algorithms. In a previous work, it has been shown that the combination of least-squares temporal-difference learning with ELM (extreme learning machine) networks is a powerful method for value prediction in continuous-state problems. This work proposes the use of ensembles to improve the approximation capabilitie...
An Internal Model Controller for Three-Phase APF Based on LS-Extreme Learning Machine
Because the dynamic model of a three-phase APF is a multivariable, nonlinear, and strongly coupled system, this paper studies an internal model controller for three-phase APF based on LS-Extreme Learning Machine. As a novel single-hidden-layer feedforward neural network, the extreme learning machine (ELM) has several advantages: simple network structure, fast learning speed, goo...
On-line Sequential Extreme Learning Machine Based on Recursive Partial Least Squares
This paper proposes an online sequential extreme learning machine algorithm based on the recursive partial least-squares method (OS-ELM-RPLS). It improves on the online sequential extreme learning machine based on recursive least-squares (OS-ELM-RLS) introduced in [1]. As in the batch extreme learning machine (ELM), in OS-ELM-RLS the input weights of a single-hidden-layer feedforward ...
On-Line Sequential Extreme Learning Machine
The original Extreme Learning Machine (ELM) [1, 2, 3] with additive neurons and RBF kernels was implemented in batch mode. This paper introduces its sequential modification based on the recursive least-squares (RLS) algorithm, referred to as the Online Sequential Extreme Learning Machine (OS-ELM). Based on OS-ELM, Online Sequential Fuzzy Extreme Learning Machine (Fuzzy-ELM) is also introduc...
Kernel Least-Squares Temporal Difference Learning
Kernel methods have attracted much research interest recently because, by utilizing Mercer kernels, non-linear and non-parametric versions of conventional supervised or unsupervised learning algorithms can be implemented, usually with better generalization ability. However, kernel methods in reinforcement learning have not been widely studied in the literature. In this paper, w...